COVID-19 Vaccinations in the United States: A Study
Project Description
Introduction
It’s no secret that COVID-19 has dramatically changed our lives over the course of the past year. It’s only recently, as vaccines have started to become available, that society has begun to return to normal.
When we sat down to discuss what to study, we were all drawn to the dramatic impact of vaccinations. And although different aspects of the issue, ranging from differential state effects, to political correlations, to adaptations over time, caught our attention, it was clear that we wanted to study vaccination data. We thought we’d turn to the United States, as our previous project had a more international focus.
Once we’d settled on a topic, we began to look for data sources. In the end, we used six datasets. The first two contained state-level vaccination data for California and Tennessee, respectively. In a similar vein, a third dataset contained county-level information about Tennessee. The next two were more general, containing basic information about doses administered over time: one was from the CDC, and one was from Our World in Data. Finally, we used a dataset from the MIT Election Lab to learn about past election results.
From there, we moved to analysis. That work was broken into three main sections. They are all easily accessible via tabs on the left-hand side of your screen. Enjoy!
Comparison: CA and TN
An issue central to the policy response to the COVID-19 pandemic is how to efficiently, equitably, and safely administer vaccinations among subpopulations of a given country’s citizens. Especially given the race-related events and movements of the past year, equitable vaccine distribution among racial groups is an important and widely studied topic. Below, we analyze vaccine administration by race in California and Tennessee, two historically partisan states, to see how racially just policies and public health goals are faring as they are put into practice. To calculate the vaccination rate for each racial group, I divided each group’s vaccine count by the size of that group’s population in California and Tennessee.
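The scaling step can be sketched as follows. This is only an illustration, not the code behind the graphs: the data frames, column names (`fully_vaccinated`, `population`, `rate`), group labels, and every number are made-up assumptions.

```r
library(dplyr)

# Hypothetical vaccine counts by state and racial group (all values invented)
vaccinated <- data.frame(
  state = c("CA", "CA", "TN", "TN"),
  race = c("Group A", "Group B", "Group A", "Group B"),
  fully_vaccinated = c(200, 90, 50, 30)
)

# Hypothetical subpopulation sizes for the same groups (all values invented)
population <- data.frame(
  state = c("CA", "CA", "TN", "TN"),
  race = c("Group A", "Group B", "Group A", "Group B"),
  population = c(1000, 600, 250, 200)
)

# Match each count to its subpopulation, then divide to get a rate
vaccination_rates <- vaccinated %>%
  inner_join(population, by = c("state", "race")) %>%
  mutate(rate = fully_vaccinated / population)

vaccination_rates
```

Dividing by the group's own population is what makes the rates comparable across groups of very different sizes.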
Unfortunately, the datasets available on California and Tennessee vaccinations used different racial categories at different levels of granularity. Regardless, there are important conclusions to draw from the data. It is clear that in both California and Tennessee, Black Americans received COVID-19 vaccinations at the lowest rate. This trend was present from the beginning of vaccine administration in both cases, demonstrating a clear racial inequity for this group.
Similarly, most other groups of people of color in California were vaccinated at lower rates than Caucasian Californians (the exceptions being Asian Californians and Native Hawaiian/Pacific Islander Californians). This suggests that Black and Brown communities encounter barriers to receiving vaccinations. Asian Californians appear to receive vaccines at rates similar to those of Caucasian Californians, which suggests a possibly less stark set of systemic barriers keeping Asian Californians from accessing the COVID-19 vaccine. Lastly, Native Hawaiian or Pacific Islander Californians appear to have consistently received the COVID-19 vaccine from the start. One theory for this occurrence is that Native Hawaiians whose tribal status is federally or regionally recognized may be able to receive the COVID-19 vaccine more easily via Indian Health Service vaccination clinics. However, this theory does not align with the low rates at which non-Hawaiian Natives appear to be receiving vaccines.
In Tennessee, we see a similar vaccination rate for Asian Tennesseans and Caucasian Tennesseans. This also suggests that Asian Tennesseans encounter less stark systemic race-related barriers to COVID-19 vaccination. However, Black Tennesseans appear to receive COVID-19 vaccinations at lower rates than their non-Black counterparts, which suggests that vaccine distribution between Black and non-Black communities in Tennessee is not equitable.
Lastly, it is important to note that in general, the overall vaccination rates in Tennessee are higher than in California. I theorize that this is because the Tennessee dataset did not specify whether vaccine counts meant fully vaccinated persons or vaccine doses administered, whereas the California dataset supplied counts of fully vaccinated persons. Since the overall purpose of this visualization was to compare rates between racial groups, and I believe that it is a fair assumption to say that racial groups and vaccine type (i.e. one-dose series or two-dose series) are not confounded, I hope that you all will find this visualization adequate.
The above graph shows the change in total vaccine doses in California between February 15 and April 15, 2021. I chose to analyze total doses instead of proportion of populations to demonstrate the incredibly high need for vaccine doses in highly populated areas, such as Los Angeles County (colored dark purple in the April 15 map). Total doses administered are highest in the more populous counties, specifically those around the Bay Area, Los Angeles, San Diego, Sacramento, and the Central Valley. If you look closely, you can see that some less populous counties also saw a change in doses administered between February 15 and April 15, but that change is much smaller than the changes in populous areas. This can be explained by three main factors. The first (and most obvious) is that vaccine counts should be lower where there are fewer people. The second is that rural areas tend to be more conservative, and conservative populations tend to include more anti-vaccine individuals. The third (and most important) is that public health infrastructure in rural areas tends to be less thorough, presenting many access issues for agricultural communities, Native communities, elderly people, and other populations that live in rural areas.
Lastly, the above graph shows the change in total vaccine doses in Tennessee between February 15 and April 15, 2021. Total doses administered are highest in the more populous counties, specifically those around Nashville, Memphis, Knoxville, and Chattanooga. As in California, less populous counties saw only a small change in doses administered between February 15 and April 15. This suggests that rural counties in Tennessee face challenges in administering vaccine doses to their populations that are similar to those in rural Californian counties.
Politics and Vaccinations
Data and Methods
Much has been made of the possible correlation between political inclination and vaccine hesitancy, a phenomenon where people refuse to get their COVID-19 vaccination shot, even though the inoculation has been proven sound from a medical perspective. Articles such as this one from Fortune Magazine and this one from the New York Times allege that Trump voters and, to some extent, Republicans, are more likely to refuse vaccination.
To visualize this idea, I drew from two data sources. First, I relied on the CDC’s vaccination records. I was comfortable with the reliability of that source. I also drew from the MIT Election Lab, which provided me with a comprehensive list of states’ voting records, including the 2020 presidential election, which I was most interested in.
From there, it was a matter of data-wrangling: I merged the two datasets, selected the variables I was going to use, and created the required new variables. The first of these was the ratio of vaccines administered to vaccines delivered, per state. The second was an indicator variable, which allowed me to see how a state voted in 2020 (Trump vs. Biden).
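A minimal sketch of that wrangling is below. It is not the original analysis code: the two data frames are tiny made-up stand-ins for the CDC and MIT Election Lab data, all column names are hypothetical, and the ratio is assumed to be doses administered divided by doses delivered.

```r
library(dplyr)

# Hypothetical stand-in for the CDC records (names and values are invented)
cdc <- data.frame(
  state = c("State A", "State B", "State C"),
  doses_delivered = c(1000, 2000, 500),
  doses_administered = c(800, 1200, 450)
)

# Hypothetical stand-in for the 2020 election results (invented)
election <- data.frame(
  state = c("State A", "State B", "State C"),
  trump_votes = c(400, 900, 100),
  biden_votes = c(600, 700, 300)
)

state_ratios <- cdc %>%
  inner_join(election, by = "state") %>%
  mutate(
    # Assumed direction: share of delivered doses that were administered
    admin_ratio = doses_administered / doses_delivered,
    # Indicator for which candidate carried the state in 2020
    winner_2020 = if_else(trump_votes > biden_votes, "Trump", "Biden")
  ) %>%
  select(state, admin_ratio, winner_2020)

state_ratios
```

With the ratio defined this way, a value near 1 means a state has used almost all of the doses it received.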
After that, it was a matter of visualization. I proceeded in four steps.
Look at the big picture. For this, I made a map of state-level vaccination administration ratios, then compared it to a map that showed the percentage of people vaccinated.
The above convinced me of the importance of understanding what leads to low administration ratios, as much of the time, that appeared to be a barrier to achieving high levels of per-capita vaccination. To confirm this inkling, I turned to a scatterplot. Once satisfied there, I decided to see if my factor of interest, politics, could play a role.
I created a map of the United States, colored by politics, then shaded by ratio. I thought I saw a clear trend towards Republican states being worse at administering the vaccines they were given, but I wanted to investigate further.
Looking at a boxplot, I confirmed my visual assumption: Republican states had a much lower median ratio.
These steps, and conclusions, are described in more detail below.
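The comparison a boxplot makes in the last step comes down to a median per group. The sketch below shows that calculation on invented numbers, so it says nothing about the real result; the column names `winner_2020` and `admin_ratio` are hypothetical.

```r
library(dplyr)

# Hypothetical per-state ratios; every value here is made up
state_ratios <- data.frame(
  state = paste("State", LETTERS[1:6]),
  winner_2020 = c("Biden", "Biden", "Biden", "Trump", "Trump", "Trump"),
  admin_ratio = c(0.80, 0.85, 0.90, 0.70, 0.75, 0.95)
)

# Median administration ratio for each group of states
median_by_winner <- state_ratios %>%
  group_by(winner_2020) %>%
  summarise(median_ratio = median(admin_ratio), n_states = n())

median_by_winner
```

The median is less sensitive than the mean to a single unusual state, which is one reason a boxplot is a reasonable check on an impression formed from a map.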
Initial Maps, Scatterplot
Below are the first two visualizations I created, meant to justify my investigation of political impacts on vaccination ratios.
Political Map, Boxplot
Tracking Statewide Vaccinations
When vaccines were first starting to be administered, each state had its own guidelines for eligibility, meaning who was allowed to get vaccinated. Most states started with frontline responders, such as doctors, nurses, and other medical workers. Then, they expanded to the state’s most vulnerable populations, and continued to extend the guidelines, using age, occupation, and preexisting health conditions as markers for who would be allowed to get vaccinated. Each state followed a similar process, and each state was allocated a certain number of vaccine doses each week, in an amount proportionate to its population.
However, we see that some states opened up their guidelines much faster than other states. For instance, on one side of the spectrum, New Jersey was one of the last states to open up vaccine eligibility to everyone 16 and older, on April 19th. On the other side, Alaska, Alabama, and Arkansas opened up on March 9th, March 24th, and March 30th, respectively (US News).
This led me to my two main questions:
How did each state differ in the number of doses they were administering each day?
Are some states vaccinating their populations at a faster rate than other states?
To answer these questions, I decided to make a time lapse of the United States showing the cumulative number of doses given over time in each state. I started on January 13th, 2021 and went up to May 15th, 2021, cumulatively adding the number of doses given each day in each state.
For instance, if on January 13th, there were 13361 doses given in Massachusetts, and 14697 doses given on January 14th in Massachusetts, the number that I measured for January 13th would be 13361 and the number that I measured for January 14th would be 13361 + 14697 = 28058.
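The running total in that example can be computed with a grouped cumulative sum. This is a sketch rather than the original code: `daily_vaccinations` is a hypothetical column name, and the two rows simply reuse the Massachusetts numbers from the example above.

```r
library(dplyr)

# Two days of daily doses, taken from the worked example above
daily <- data.frame(
  state = c("massachusetts", "massachusetts"),
  Day = as.Date(c("2021-01-13", "2021-01-14")),
  daily_vaccinations = c(13361, 14697)
)

# Within each state, sort by date and accumulate the daily doses
cumulative <- daily %>%
  group_by(state) %>%
  arrange(Day, .by_group = TRUE) %>%
  mutate(cumulative_vaccinations = cumsum(daily_vaccinations)) %>%
  ungroup()

cumulative
```

Grouping by state before calling `cumsum()` matters: without it, the running total would carry over from one state into the next.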
Below is the main code that I used to make the time lapse visualization. First, I merged the vaccine dataset that I created (kv_states) with the usa_states dataset which contains the information to make a map of the United States. Then, I mapped that new dataset, called usa_vaccine_states, and used gganimate to bring it from a static map to a timelapse.
library(dplyr)
library(ggplot2)    # map_data() also requires the maps package to be installed
library(gganimate)  # gifski_renderer() requires the gifski package to be installed

usa_states <- map_data(map = "state", region = ".")

# merging the US map dataset with the vaccine dataset
# using inner_join to remove alaska & hawaii
usa_vaccine_states <- kv_states %>%
  inner_join(usa_states, by = c("state" = "region"))

# animate the map to become a time lapse/gif
anim_map_us <- usa_vaccine_states %>%
  ggplot() +
  geom_polygon(aes(x = long, y = lat, group = group, fill = cumulative_vaccinations),
               color = "white") +
  theme_void() +
  coord_fixed(ratio = 1.3) +
  labs(fill = "Number of People Vaccinated",
       title = "USA Daily Vaccinations by State",
       subtitle = "{closest_state}") +
  theme(legend.position = "right") +
  # direction must be 1 or -1; 1 keeps the palette order, so higher values are darker blue
  scale_fill_distiller(palette = "YlGnBu", direction = 1) +
  # using gganimate's transition_states to inform how the map should move
  transition_states(states = Day, transition_length = 2, state_length = 5)

anim_map_us

# adjusting the number of frames in order to cycle through all of the dates
animate(anim_map_us, nframes = 2 * length(unique(usa_vaccine_states$Day)),
        renderer = gifski_renderer("kriti/usa_vaccine_number.gif"))
Here is the result:
Cumulative Number of Vaccine Doses Delivered by State
As we see on the time lapse, it seems that California, Texas, Florida, and New York consistently delivered the most vaccine doses from the beginning of the vaccine rollout. For instance, on January 20th, these states had begun to turn green, while the other states were still yellow.
Cumulative Number of Vaccine Doses Delivered by State
The trend continues throughout the time lapse as well. As the other states’ cumulative numbers of vaccine doses change from January to May, we can see that, in general, the eastern half of the country is toward the upper end of the scale, while the western half is toward the lower end.
However, it’s important to note that while California, Texas, Florida, and New York have the most vaccine doses given, they also have the biggest populations. Because they have the biggest populations, it makes sense that they would get a larger number of doses allocated to them, which means that they would give a larger number of doses each day, meaning their cumulative number of vaccine doses given will be very high.
So, in order to be as accurate as possible in my conclusions (and answer my second question), I decided to make another map that tracked the cumulative number of vaccine doses given as a percent of state population. That way, I could account for population when measuring how fast each state was vaccinating its residents, and a less populous state would not be overshadowed by a larger one when both had given the same proportion of doses, even if the smaller state’s raw number was much lower.
In order to do this, I created a new column that took the cumulative number of vaccines for each day, and divided it by the state population for each day in each state.
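A rough sketch of that new column follows. The state names, populations, and dose counts are invented, `population` and `cumulative_pct_pop` are hypothetical column names, and the result is multiplied by 100 so that it reads as a percent.

```r
library(dplyr)

# Hypothetical cumulative doses for a large and a small state (invented values)
cumulative <- data.frame(
  state = c("state a", "state a", "state b", "state b"),
  Day = as.Date(c("2021-01-13", "2021-01-14", "2021-01-13", "2021-01-14")),
  cumulative_vaccinations = c(100, 250, 10, 25)
)

# Hypothetical state populations (invented values)
populations <- data.frame(
  state = c("state a", "state b"),
  population = c(10000, 1000)
)

# Attach each state's population, then express doses as a percent of it
cumulative_pct <- cumulative %>%
  left_join(populations, by = "state") %>%
  mutate(cumulative_pct_pop = 100 * cumulative_vaccinations / population)

cumulative_pct
```

In this toy data the two states differ tenfold in raw doses but have identical percentages on each day, which is the kind of difference the second map is meant to remove.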
From there, I used the same process as I did for the first timelapse.
Here is the result:
Cumulative Number of Vaccine Doses Delivered by State as a Percent of Population
While I was expecting the map to even out somewhat, I was surprised by how uniform it became. In the previous map, which measured the number of vaccinations, states fell all across the scale. In the second map, however, all of the states were a shade of dark blue, which meant that their cumulative numbers of vaccine doses given as a percent of population were all close to each other, and close to the higher end of the scale.
Lastly, I created a Shiny app that represents the same information in line graph form, so that someone can more easily compare vaccine doses given across different states. The user is able to click through and decide which variable they want to measure. Below is a description of the variables:
Cumulative Vaccinations: the cumulative number of vaccine doses given each day in a state
Cumulative Vaccines Delivered as a Percent of Population: the cumulative number of vaccine doses given each day in a state divided by the state population, multiplied by 100
Vaccinations Administered per Day: the number of vaccine doses given each day in a state
Vaccinations Administered per Day as a Percent of Population: the number of vaccine doses given each day in a state divided by the state population, multiplied by 100
Limitations
As a limitation, I would like to note that the second time lapse map measures the cumulative number of vaccine doses administered as a percent of population, and does not necessarily reflect the number of people fully or partially vaccinated. The data I used only had the number of vaccine doses given per day. Since Pfizer-BioNTech and Moderna require two doses for full vaccination, while Johnson & Johnson requires only one, I did not have a way to extract the information needed to measure the number of people fully or partially vaccinated in each state. I do think it would be extremely informative to have a map that used the number of people fully vaccinated as its measurement.
Conclusion
We discovered several interesting patterns with regard to vaccination in the United States.
Lastly, we found that while there has been a lot of discussion within the United States about different states moving faster than others when it comes to vaccinations, most states are giving vaccination doses at a similar rate when taking into account their populations, even if raw numbers differ widely.